Smoothl1lossgrad

Computes the gradient of the Smooth L1 Loss operation. This operator is the backward-pass counterpart of the Smooth L1 Loss operator.

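Smooth L1 Loss is a smooth combination of L1 Loss and L2 Loss. It behaves like L2 Loss when the error between prediction and target is small and like L1 Loss when the error is large, which reduces the influence of outliers.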

\[\text{diff}_i = \text{x1}_i - \text{x2}_i\]

\[\begin{split}\text{dx1}_i = \begin{cases} \text{dy}_i, & \text{if } \text{diff}_i > \beta \\ -\text{dy}_i, & \text{if } \text{diff}_i < -\beta \\ \frac{\text{diff}_i}{\beta} \times \text{dy}_i, & \text{if } -\beta \leq \text{diff}_i \leq \beta \end{cases}\end{split}\]

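Here x1 is the predicted value (predict), x2 is the target value (target), dy is the upstream gradient from the following layer, and dx1 is the gradient with respect to the prediction x1. beta is the smoothing parameter that sets the transition point between L2 Loss and L1 Loss.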
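The piecewise formula above can be sketched as a plain scalar loop, which may be useful for checking results on a host machine. This is a minimal sketch only: the function name `ref_smoothl1lossgrad` is hypothetical and not part of the library, the argument order merely mirrors the fp32 private-memory prototype, and the check that beta is strictly positive is an assumption (the middle branch divides by beta).

```c
#include <assert.h>

/* Hypothetical host-side reference, not a library function.
 * Applies the piecewise gradient formula element by element. */
void ref_smoothl1lossgrad(const float *dy, float *dx1,
                          const float *x1, const float *x2,
                          int length, float beta)
{
    /* Assumption: beta must be strictly positive, because the
     * middle branch divides by it. */
    assert(beta > 0.0f);

    for (int i = 0; i < length; ++i) {
        float diff = x1[i] - x2[i];

        if (diff > beta) {
            dx1[i] = dy[i];                /* L1 region, positive side */
        } else if (diff < -beta) {
            dx1[i] = -dy[i];               /* L1 region, negative side */
        } else {
            dx1[i] = diff / beta * dy[i];  /* L2 region: -beta <= diff <= beta */
        }
    }
}
```

For example, with beta = 1.0 and dy = 1.0, a diff of 2.0 falls in the first branch and gives dx1 = 1.0, a diff of -3.0 gives dx1 = -1.0, and a diff of 0.5 gives dx1 = 0.5.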

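Inputs:

dy - Address of the upstream gradient data from the following layer.
x1 - Address of the predicted values used in the forward pass.
x2 - Address of the target values used in the forward pass.
length - Number of elements to compute.
beta - Smoothing parameter that sets the transition point between L2 Loss and L1 Loss. Typical values range from 0.1 to 1.0.
core_mask - Core mask (required only by the shared-memory version).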

Outputs:

dx1 - Address of the computed gradient with respect to the prediction x1.

Supported platforms:

FT78NE MT7004

Notes

FT78NE supports fp32.
MT7004 supports fp16 and fp32.

Shared-memory version:

void fp_smoothl1lossgrad_s(float *dy, float *dx1, float *x1, float *x2, int length, float beta, int core_mask)

void hp_smoothl1lossgrad_s(half *dy, half *dx1, half *x1, half *x2, int length, half beta, int core_mask)

C usage example:

// MT7004 example
#include <stdio.h>
#include <smoothl1lossgrad.h>

int main(int argc, char* argv[]) {
    // Assume the buffers reside in DDR space
    float *dy = (float *)0xA0000000;   // upstream gradient
    float *x1 = (float *)0xA1000000;   // predicted values
    float *x2 = (float *)0xA2000000;   // target values
    float *dx1 = (float *)0xB0000000;  // output gradient (with respect to x1)

    int length = 1000;
    float beta = 1.0f;  // smoothing parameter
    int core_mask = 0xff;

    fp_smoothl1lossgrad_s(dy, dx1, x1, x2, length, beta, core_mask);
    return 0;
}

Private-memory version:

void fp_smoothl1lossgrad_p(float *dy, float *dx1, float *x1, float *x2, int length, float beta)

void hp_smoothl1lossgrad_p(half *dy, half *dx1, half *x1, half *x2, int length, half beta)

C usage example:

// MT7004 example
#include <stdio.h>
#include <smoothl1lossgrad.h>

int main(int argc, char* argv[]) {
    // Assume the buffers reside in L2 space
    float *dy = (float *)0x10000000;   // upstream gradient
    float *x1 = (float *)0x10001000;   // predicted values
    float *x2 = (float *)0x10002000;   // target values
    float *dx1 = (float *)0x10003000;  // output gradient (with respect to x1)

    int length = 1000;
    float beta = 1.0f;  // smoothing parameter

    fp_smoothl1lossgrad_p(dy, dx1, x1, x2, length, beta);
    return 0;
}